Towards Duration Invariance of i-Vector-based Adaptive Score Normalization
Authors
Abstract
It is widely acknowledged that duration variability strongly affects the biometric performance of speaker recognition systems. State-of-the-art approaches, which employ i-vector representations, apply adaptive symmetric (AS) score normalizations to improve the performance of the underlying system by using specific statistics on reference and probe templates obtained from additional datasets. Incorporating duration information turns out to be vital in order to prevent a significant rise in entropy, since the variation, and likely reduction, of signal duration from reference to probe samples is unpredictable. In this paper we propose a duration-invariant extension of the AS-Norm, which is capable of computing more robust scores over a wide range of duration variabilities. The presented technique requires less computational effort at the time of speaker verification and yields a 19% relative gain in the minimum detection costs on the current NIST i-vector challenge database, compared to the provided NIST i-vector baseline system.
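For orientation, the conventional AS-Norm that the abstract builds on can be sketched as follows. This is a minimal illustration of the commonly used form (top-N cohort selection, then averaging the two z-normalized scores), not the duration-invariant extension proposed in the paper. Cosine scoring, the function names, and the `top_n` default are assumptions made for the example.

```python
import numpy as np

def cosine_scores(x, cohort):
    """Cosine similarity between one vector and each row of a cohort matrix."""
    x = x / np.linalg.norm(x)
    cohort = cohort / np.linalg.norm(cohort, axis=1, keepdims=True)
    return cohort @ x

def as_norm(reference, probe, cohort, top_n=200, eps=1e-12):
    """Generic adaptive symmetric score normalization (illustrative only).

    reference, probe: 1-D i-vectors; cohort: 2-D array, one i-vector per row.
    Cosine scoring and top_n are assumptions, not taken from the paper.
    """
    raw = float(cosine_scores(probe, reference[None, :])[0])

    # Score each side of the trial against the cohort.
    ref_scores = cosine_scores(reference, cohort)
    probe_scores = cosine_scores(probe, cohort)

    # Adaptive part: keep only the top-N closest cohort scores per side.
    n = min(top_n, len(cohort))
    ref_top = np.sort(ref_scores)[-n:]
    probe_top = np.sort(probe_scores)[-n:]

    # Symmetric part: average the two z-normalized versions of the raw score.
    z_ref = (raw - ref_top.mean()) / (ref_top.std() + eps)
    z_probe = (raw - probe_top.mean()) / (probe_top.std() + eps)
    return 0.5 * (z_ref + z_probe)
```

Because the cohort statistics are computed separately for the reference and the probe, a duration mismatch between the two sides shifts those statistics differently, which is the sensitivity the abstract says its extension addresses.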
Similar resources
Analysis of mutual duration and noise effects in speaker recognition: benefits of condition-matched cohort selection in score normalization
The biometric and forensic performance of automatic speaker recognition systems degrades under noisy and short probe utterance conditions. Score normalization is an effective tool taking into account the mismatch of reference and probe utterances. In an adaptive symmetric score normalization scheme for state-of-the-art i-vector recognition systems, a set of cohort speakers is employed to calcul...
Evaluation of i-vector Speaker Recognition Systems for Forensic Application
This paper contributes a study on i-vector based speaker recognition systems and their application to forensics. The sensitivity of i-vector based speaker recognition is analyzed with respect to the effects of speech duration. This approach is motivated by the potentially limited speech available in a recording for a forensic case. In this context, the classification performance and calibration...
Speech Enhancement by Modified Convex Combination of Fractional Adaptive Filtering
This paper presents new adaptive filtering techniques used in speech enhancement systems. Adaptive filtering schemes are subject to different trade-offs regarding their steady-state misadjustment, speed of convergence, and tracking performance. Fractional Least-Mean-Square (FLMS) is a new adaptive algorithm which has better performance than the conventional LMS algorithm. Normalization of LMS ...
Prediction of soil cation exchange capacity using support vector regression optimized by genetic algorithm and adaptive network-based fuzzy inference system
Soil cation exchange capacity (CEC) is a parameter that represents soil fertility. Because CEC is difficult to measure, pedotransfer functions (PTFs) can be routinely applied to predict it from soil physicochemical properties that can be easily measured. This study developed the support vector regression (SVR) combined with genetic algorithm (GA) together with the adaptive network-based fuzzy infe...
Improving short utterance based i-vector speaker recognition using source and utterance-duration normalization techniques
A significant amount of speech is typically required for speaker verification system development and evaluation, especially in the presence of large intersession variability. This paper introduces source and utterance-duration normalized linear discriminant analysis (SUN-LDA) approaches to compensate for session variability in short-utterance i-vector speaker verification systems. Two variations ...
Publication date: 2014